Finite-state models for lexical reordering in spoken language translation
Authors
Abstract
The problem of machine translation can be viewed as consisting of two phases: (a) a lexical choice phase, where appropriate target language lexical items (words or phrases) are chosen for each source language lexical item, and (b) a reordering phase, where the chosen target language lexical items are reordered to produce a meaningful target language string. In earlier work we have shown that finite-state models for lexical choice can be learned from bilingual corpora [6]. In this paper, we focus on stochastic finite-state models for lexical reordering and describe an algorithm to learn them from bilingual corpora. We have developed a stochastic finite-state English-Japanese translation system by composing finite-state lexical choice and lexical reordering models. We have evaluated it using the string edit distance of the translated string from a given reference string. Using this metric, the English-Japanese translation system scored 70.9% on English speech transcriptions.
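The edit-distance evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so everything here is an assumption: the function names `edit_distance` and `string_accuracy` are hypothetical, insertions, deletions, and substitutions each cost 1, and the score is taken to be one minus the edit distance divided by the reference length.

```python
from typing import Sequence


def edit_distance(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs, assumed)."""
    # prev[j] holds the distance between the hypothesis prefix seen so far
    # and the first j reference tokens.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # turning i hypothesis tokens into an empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(
                prev[j] + 1,         # delete the hypothesis token
                curr[j - 1] + 1,     # insert the reference token
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def string_accuracy(hyp: Sequence[str], ref: Sequence[str]) -> float:
    """Assumed score: 1 - edit_distance / reference length."""
    if not ref:
        raise ValueError("reference must contain at least one token")
    return 1.0 - edit_distance(hyp, ref) / len(ref)


if __name__ == "__main__":
    hypothesis = ["a", "b", "c", "d"]
    reference = ["a", "b", "x", "d"]
    print(string_accuracy(hypothesis, reference))
```

Under this assumed definition a heavily wrong hypothesis can score below zero, because the number of edits may exceed the reference length; whether the reported 70.9% was computed this way is not stated in the abstract.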
Similar resources
Generalizing Word Lattice Translation
Word lattice decoding has proven useful in spoken language translation; we argue that it provides a compelling model for translation of text genres, as well. We show that prior work in translating lattices using finite state techniques can be naturally extended to more expressive synchronous context-free grammar-based models. Additionally, we resolve a significant complication that non-linear wo...
A Finite-State Approach to Machine Translation
The problem of machine translation can be viewed as consisting of two subproblems: (a) Lexical Selection and (b) Lexical Reordering. We propose stochastic finite-state models for these two subproblems in this paper. Stochastic finite-state models are efficiently learnable from data, effective for decoding and are associated with a calculus for composing models which allows for tight integration of co...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
The RWTH Aachen German to English MT System for IWSLT 2015
This work describes the statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2015. We participated in the MT and SLT tracks for the German→English language pair. We employ our state-of-the-art phrase-based and hierarchical phrase-based baseline systems for the MT track. ...
A Reordering Approach for Statistical Machine Translation
This paper presents a Markov-based hierarchical reordering scheme for lexical reordering to incorporate into a phrase-based statistical machine translation system. The goal is to reorder the words and phrases in source language syntactic structure into their corresponding target language syntactic order to make translation easier. Without reordering during language translation, sentences can onl...
Journal title:
Volume, issue:
Pages: -
Publication date: 2000